Abstract

The proprietary nature of Hedge Fund investing means that it is common practice
for managers to release minimal information about their returns. The construction
of a Fund of Hedge Funds portfolio requires a correlation matrix which often has to
be estimated using a relatively small sample of monthly returns data, which induces
noise. In this paper random matrix theory (RMT) is applied to a cross-correlation
matrix C, constructed using hedge fund returns data. The analysis reveals a number
of eigenvalues that deviate from the spectrum suggested by RMT. The components
of the deviating eigenvectors are found to correspond to distinct groups of strategies
that are applied by hedge fund managers. The Inverse Participation Ratio is used to
quantify the number of components that participate in each eigenvector. Finally, the
correlation matrix is cleaned by separating the noisy part from the non-noisy part
of C. This technique is found to greatly reduce the difference between the predicted
and realised risk of a portfolio, leading to an improved risk profile for a fund of
hedge funds.

1 Introduction



A Hedge Fund is a lightly-regulated private investment vehicle that may 
utilise a wide range of investment strategies and instruments. These funds 
may use short positions, derivatives, leverage and charge incentive-based fees. 
Normally, they are structured as limited partnerships or offshore investment 
companies. Hedge Funds pursue positive returns in all markets and hence are
described as "absolute return" strategies. 

Hedge Funds are utilised by pension funds, high net-worth individuals and 
institutions, due to their low correlation to traditional long-only investment 
strategies. The incentive-based performance fees, earned by hedge fund man- 
agers, align the interest of the hedge fund manager with that of the investor. 
The performance of Hedge Funds has been impressive, with the various Hedge 
Fund indices providing higher returns, with lower volatility, than traditional 
assets over many years. As of the end of the first quarter of 2006, the total assets
managed by hedge funds worldwide were estimated at $1.25 trillion [1]. Hedge
Funds generally only report their returns on a monthly basis and this means 
that there is a very limited amount of data available to study as databases of 
Hedge Fund returns have only been in operation for about 15 years. This is in 
keeping with the highly secretive, proprietary nature of Hedge Fund investing. 
The amount of information reported by a Hedge Fund about how and where 
it is producing its returns is often limited to sectoral overviews and strategy 
allocations. For an introduction to hedge funds see [2,3]. 

Significant diversification benefits can be gained by investing in a variety of 
hedge fund strategies, due to the presence of low and even negative correlations 
between different hedge fund strategies. Such strategies can be broken up into 
two general categories: directional and market neutral. Directional strategies
(for example Long/Short Equity, Emerging Markets, Macro and Managed Futures)
have a high-risk, high-return profile and act as return enhancers to a
traditional portfolio. Market neutral strategies (for example Convertible
Arbitrage, Equity Market Neutral and Fixed Income Arbitrage) deploy a low
risk profile and act as a substitute for some proportion of the fixed income
holdings in an investor's portfolio [2,3].

A Fund of Hedge Funds allows investors to have access to a large diverse 
portfolio of Hedge Funds without having to carry out due diligence on each 
individual manager. The diversification benefits provided by Fund of Funds are 
brought about by investing in a number of funds that have a low correlation to 
each other. These correlations are often calculated by using equally weighted 
fund returns and can contain a significant amount of noise due to the very 
small amount of returns data available for hedge funds [3].

In this paper we apply Random Matrix Theory to Hedge Fund returns data
with the aim of reducing the level of noise in the correlation matrices formed
from this data and hence constructing a fund of hedge funds with an improved
risk profile. Previous studies have used the information found in the
RMT-defined deviating eigenvalues of a correlation matrix as inputs into a minimum
spanning tree [4] to enable characterisation of Hedge Fund strategies. In this
paper the components of the deviating eigenvectors are shown to correspond
to distinct groups of strategies that are applied by hedge fund managers and
this is exploited to construct a portfolio with reduced levels of risk.

This paper is organised as follows: in Section 2 we review RMT and discuss its
use in extracting information from a correlation matrix of Hedge Fund
returns. In Section 3 we present the results obtained by applying Random
Matrix Theory to Hedge Funds and, in the final section, we draw our
conclusions.



2 Methods 



2.1 Random Matrix Theory 

Given returns $G_i(t)$, $i = 1, \ldots, N$, of a collection of Hedge Funds we define
a normalised return in order to standardise the different fund volatilities. We
normalise $G_i$ with respect to its standard deviation $\sigma_i$ as follows:

$$g_i(t) = \frac{G_i(t) - \bar{G}_i}{\sigma_i} \qquad (1)$$

where $\sigma_i$ is the standard deviation of $G_i$ for assets $i = 1, \ldots, N$ and $\bar{G}_i$ is the
time average of $G_i$ over the period studied.

Then the equal time cross correlation matrix is expressed in terms of $g_i(t)$ as

$$C_{ij} = \langle g_i(t)\, g_j(t) \rangle. \qquad (2)$$

The elements of $C_{ij}$ are limited to the domain $-1 \le C_{ij} \le 1$, where $C_{ij} = 1$
defines perfect correlation between funds, $C_{ij} = -1$ corresponds to perfect
anti-correlation and $C_{ij} = 0$ corresponds to uncorrelated funds. In matrix
notation, the correlation matrix can be expressed as

$$C = \frac{1}{T} G G^{\top} \qquad (3)$$

where $G$ is an $N \times T$ matrix with elements $g_{it}$.
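Equations (1) to (3) can be sketched in a few lines of NumPy. This is an illustrative sketch only: the function name `correlation_matrix` and the synthetic input are assumptions made for the example, and the population standard deviation (division by $T$) is assumed so that the diagonal of $C$ is exactly one.

```python
import numpy as np

def correlation_matrix(returns):
    """returns: array of shape (N, T), one row of monthly returns per fund."""
    G = np.asarray(returns, dtype=float)
    # Equation (1): subtract each fund's time average and divide by its
    # standard deviation (population form, i.e. division by T).
    g = (G - G.mean(axis=1, keepdims=True)) / G.std(axis=1, keepdims=True)
    T = g.shape[1]
    # Equation (3): C = (1/T) g g^T, whose entries are the averages in Equation (2).
    return g @ g.T / T

# Synthetic returns for illustration only (not the hedge fund data of the paper).
rng = np.random.default_rng(0)
sample = rng.normal(size=(5, 60))
C = correlation_matrix(sample)
```

With this normalisation the result coincides with the usual Pearson correlation matrix of the rows.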

The spectral properties of C may be compared to those of a "random" Wishart 
correlation matrix [8,11], 

$$R = \frac{1}{T} A A^{\top} \qquad (4)$$

where $A$ is an $N \times T$ matrix with each element random with zero mean and
unit variance. Statistical properties of random matrices have been known for 
many years in the physics literature [5] and have been applied to financial 
problems relatively recently [6,7,8,9,10,11,12,13,14,15,16]. 

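In particular, the limiting property for the sample size $N \to \infty$ and sample
length $T \to \infty$, providing that $Q = T/N \ge 1$ is fixed, has been examined to show
analytically that the distribution of eigenvalues $\lambda$ of the random correlation
matrix $R$ is given by:

$$P_{rm}(\lambda) = \frac{Q}{2\pi\sigma^2} \frac{\sqrt{(\lambda_+ - \lambda)(\lambda - \lambda_-)}}{\lambda} \qquad (5)$$

for $\lambda$ within the region $\lambda_- \le \lambda_i \le \lambda_+$, where $\lambda_-$ and $\lambda_+$ are given by

$$\lambda_\pm = \sigma^2 \left(1 + \frac{1}{Q} \pm 2\sqrt{\frac{1}{Q}}\right) \qquad (6)$$

where $\sigma^2$ is the variance of the elements of $G$ (for $G$ normalised this is equal
to unity).

$\lambda_\pm$ are the bounds of the theoretical eigenvalue distribution. Eigenvalues that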
are outside this region are said to deviate from Random Matrix Theory. Hence 
by comparing the empirical distribution of the eigenvalues of the funds cor- 
relation matrix to the distribution for a random matrix as given in Equation 
5, we can identify the deviating eigenvalues. These deviating eigenvalues are 
said to contain information about the system under consideration and we use 
eigenvector analysis to identify specifically the information present. 
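The comparison described above can be sketched as follows. The function names are hypothetical and chosen for this example; the only inputs taken from the paper are the sample dimensions $N = 49$ and $T = 105$.

```python
import numpy as np

def rmt_bounds(Q, sigma2=1.0):
    """Equation (6): lower and upper edges of the random-matrix spectrum."""
    spread = 2.0 * np.sqrt(1.0 / Q)
    return sigma2 * (1.0 + 1.0 / Q - spread), sigma2 * (1.0 + 1.0 / Q + spread)

def rmt_density(lam, Q, sigma2=1.0):
    """Equation (5); returns zero outside [lambda_-, lambda_+]. Expects lam > 0."""
    lo, hi = rmt_bounds(Q, sigma2)
    lam = np.asarray(lam, dtype=float)
    radicand = np.clip((hi - lam) * (lam - lo), 0.0, None)
    return Q / (2.0 * np.pi * sigma2) * np.sqrt(radicand) / lam

def deviating_eigenvalues(C, T):
    """Eigenvalues of an N x N correlation matrix C lying outside the RMT band."""
    N = C.shape[0]
    lo, hi = rmt_bounds(T / N)
    eigvals = np.linalg.eigvalsh(C)
    return eigvals[(eigvals < lo) | (eigvals > hi)]

# Bounds for the sample dimensions quoted in the paper (N = 49, T = 105).
lam_minus, lam_plus = rmt_bounds(105 / 49)
```

For these dimensions the bounds evaluate to roughly 0.10 and 2.83, so any empirical eigenvalue above about 2.83 would be flagged as deviating.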



2.2 Eigenvector Analysis 



Deviations of $P(\lambda)$ from the RMT result $P_{rm}(\lambda)$ imply that these deviations
should also be displayed in the statistics of the corresponding eigenvector
components [8]. In order to interpret the meaning of the deviating eigenvec- 
tors, we note that the largest eigenvalue is of an order of magnitude larger than 
the others, which constrains the remaining N — 1 eigenvalues since Tr [C] = N. 
Thus, in order to analyse the contents of the remaining eigenvectors, we first 
remove the effect of the largest eigenvalue. To do this we use the linear regres- 
sion (as discussed in [11]) 

$$G_i(t) = \alpha_i + \beta_i G^{large}(t) + \epsilon_i(t), \qquad (7)$$

where $G^{large}(t) = \sum_{i=1}^{N} u_i^{large} G_i(t)$ and $N = 49$ is the number of funds in our
sample. Here $u^{large}$ corresponds to the components of the largest eigenvector.
We then recalculate the correlation matrix $C$ using the residuals $\epsilon_i(t)$. If we
quantify the variance of the part not explained by the largest eigenvalue as
$\sigma^2 = 1 - \lambda_{large}/N$ [8], we can use this value to recalculate our values of $\lambda_\pm$.
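A minimal sketch of this step, assuming ordinary least squares is used for the regression in Equation (7). The function name `remove_market_mode` is hypothetical, and the fitting procedure is an assumption rather than a detail stated in the text.

```python
import numpy as np

def remove_market_mode(G):
    """G: (N, T) array of fund returns.

    Returns the correlation matrix of the regression residuals and the
    largest eigenvalue of the original correlation matrix.
    """
    G = np.asarray(G, dtype=float)
    N, T = G.shape
    C = np.corrcoef(G)
    eigvals, eigvecs = np.linalg.eigh(C)          # eigenvalues in ascending order
    u_large = eigvecs[:, -1]                      # eigenvector of the largest eigenvalue
    market = u_large @ G                          # G^large(t) = sum_i u_i^large G_i(t)
    # Fit G_i(t) = alpha_i + beta_i * G^large(t) + eps_i(t) for all funds at once.
    X = np.column_stack([np.ones(T), market])
    coef, *_ = np.linalg.lstsq(X, G.T, rcond=None)
    resid = G.T - X @ coef                        # eps_i(t), shape (T, N)
    return np.corrcoef(resid.T), float(eigvals[-1])
```

The second return value can then be used to rescale the bounds through $\sigma^2 = 1 - \lambda_{large}/N$.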

Using techniques for sector identification [10], we try to analyse the information
contained in the eigenvectors. We partition the funds into groups
$l = 1, \ldots, 10$ as defined by the manager's strategy (e.g. Equity Long/Short,
Managed Futures; see Appendix A for a complete strategy breakdown for the
sample). We define a projection matrix $P_{li} = 1/n_l$, with $n_l$ the number of
funds in group $l$, if fund $i$ belongs to group $l$ and $P_{li} = 0$ otherwise. For
each deviating eigenvector $u^k$ we compute the contribution
$X_l^k = \sum_{i=1}^{N} P_{li} [u_i^k]^2$ of each strategy group, where this represents
the product of the projection matrix and the square of the eigenvector components.
This allows us to measure the contribution of the different Hedge Fund
strategies to each of the eigenvectors.
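The strategy contribution can be illustrated with a short sketch. It assumes the projection weight is $1/n_l$, with $n_l$ the number of funds in group $l$, so that each contribution is the mean squared component within the group; the eigenvector and labels below are invented for the example.

```python
import numpy as np

def strategy_contributions(u, labels):
    """u: eigenvector components (length N); labels: one strategy label per fund."""
    u = np.asarray(u, dtype=float)
    labels = np.asarray(labels)
    contributions = {}
    for group in np.unique(labels):
        members = labels == group
        # P_li = 1/n_l for members and 0 otherwise, so X_l is the mean of u_i^2
        # over the funds in the group.
        contributions[str(group)] = float(np.sum(u[members] ** 2) / members.sum())
    return contributions

# Hypothetical normalised eigenvector for six funds in three strategy groups.
u = np.array([0.6, 0.6, 0.3, 0.3, 0.3, 0.1])
labels = ["Managed Futures", "Managed Futures",
          "Currency", "Currency", "Currency", "Macro"]
X = strategy_contributions(u, labels)
```

Dividing by the group size prevents a strategy from appearing dominant simply because it contains more funds.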

As suggested in [11], we also aim to assess how the effects of randomness lessen
as we move further from the RMT upper boundary, $\lambda_+$. To do this
we use the Inverse Participation Ratio (IPR). The IPR allows quantification
of the number of components that participate significantly in each eigenvector
and tells us more about the level and nature of deviation from RMT. The IPR
of the eigenvector $u^k$ is given by $I^k = \sum_{i=1}^{N} (u_i^k)^4$ and allows us to compute the
inverse of the number of eigenvector components that contribute significantly
to each eigenvector.
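The two limiting cases of the IPR can be checked numerically. The sketch below uses hypothetical eigenvectors of length 49: one in which all funds participate equally, and one carried by a single fund.

```python
import numpy as np

def inverse_participation_ratio(eigvecs):
    """eigvecs: (N, K) array whose columns are normalised eigenvectors."""
    return np.sum(np.asarray(eigvecs, dtype=float) ** 4, axis=0)

N = 49
# Every fund participates equally: each component is 1/sqrt(N).
uniform = np.full((N, 1), 1.0 / np.sqrt(N))
# A single fund carries the whole eigenvector.
localised = np.zeros((N, 1))
localised[0, 0] = 1.0

ipr_uniform = inverse_participation_ratio(uniform)[0]      # equals 1/N
ipr_localised = inverse_participation_ratio(localised)[0]  # equals 1
```

The IPR therefore ranges from $1/N$ (all components contribute) to 1 (one component contributes), and its reciprocal can be read as an effective number of participating funds.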



2.3 Application to portfolio optimisation 



The diversification of an investment into independently fluctuating assets re- 
duces its risk. However, since cross-correlations between asset prices exist, the 
accurate calculation of the cross-correlation matrix is vitally important. The 
return on a portfolio with $N$ assets is given by $\Phi = \sum_{i=1}^{N} w_i G_i$, where $G_i(t)$
is the return on asset $i$, $w_i$ is the fraction of wealth invested in asset $i$ and
$\sum_{i=1}^{N} w_i = 1$. The risk of holding this portfolio is then given by

$$\Omega^2 = \sum_{i=1}^{N} \sum_{j=1}^{N} w_i w_j C_{ij} \sigma_i \sigma_j \qquad (8)$$

where $\sigma_i$ is the volatility of $G_i$ and $C_{ij}$ are the elements of the cross-correlation
matrix. In order to find an optimal portfolio, using the Markowitz theory of
portfolio optimisation [6,18,19], we minimise $\Omega^2$ under the constraint that
the return on the portfolio, $\Phi$, is some fixed value. This minimisation can be
implemented by using two Lagrange multipliers, which leads to a set of $N$
linear equations which can be solved for $w_i$. If we minimise $\Omega^2$ for a number
of different values of $\Phi$, then we can obtain a region bounded by an upward-sloping
curve, called the efficient frontier.
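A minimal sketch of the Lagrange-multiplier solution, written as one bordered linear system in the weights and the two multipliers. The function name and the three-asset inputs are assumptions for illustration, and this version permits negative weights, i.e. it does not impose a no-short-selling restriction.

```python
import numpy as np

def min_variance_weights(C, sigma, m, target):
    """Minimise Omega^2 subject to sum(w) = 1 and w . m = target.

    C: correlation matrix, sigma: volatilities, m: expected returns.
    """
    C = np.asarray(C, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    m = np.asarray(m, dtype=float)
    N = len(m)
    cov = C * np.outer(sigma, sigma)        # C_ij * sigma_i * sigma_j
    # Stationarity of the Lagrangian plus the two constraints.
    K = np.zeros((N + 2, N + 2))
    K[:N, :N] = 2.0 * cov
    K[:N, N] = K[N, :N] = m
    K[:N, N + 1] = K[N + 1, :N] = 1.0
    rhs = np.concatenate([np.zeros(N), [target, 1.0]])
    w = np.linalg.solve(K, rhs)[:N]
    return w, float(w @ cov @ w)

# Hypothetical three-asset example.
C = np.array([[1.0, 0.2, 0.0], [0.2, 1.0, 0.1], [0.0, 0.1, 1.0]])
sigma = np.array([0.10, 0.20, 0.15])
m = np.array([0.05, 0.10, 0.07])
w, risk = min_variance_weights(C, sigma, m, target=0.07)
```

Repeating the call over a grid of `target` values traces out the frontier; adding inequality constraints on the weights would require a quadratic programming solver instead of a single linear solve.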

In the case of optimisation with a portfolio containing only Hedge Funds, the
additional constraint of no short-selling is natural due to the difficulties in
short-selling funds (short-selling may be achievable through the use of
swaps but is uncommon) [2,3].

In order to demonstrate the effects of randomness on the cross-correlation 
matrix of hedge funds we first divide the time series studied into two equal 
parts. We assume that we have perfect knowledge of the future average returns
$m_i$ by taking the observed returns on the second sub-period. We calculate

(1) the predicted efficient frontier using the correlation matrix for the first
sub-period and the expected returns $m_i$

(2) the realised efficient frontier using the correlation matrix for the second
sub-period and the expected returns $m_i$.
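The split-sample comparison can be sketched for a fixed set of weights. The text does not state which volatilities enter each risk estimate, so this example assumes each half uses its own sample covariance matrix; the function name is hypothetical.

```python
import numpy as np

def split_sample_risks(returns, w):
    """returns: (N, T) array of fund returns; w: portfolio weights (length N).

    Returns (predicted, realised) portfolio variances, using the first and
    second halves of the sample respectively.
    """
    returns = np.asarray(returns, dtype=float)
    w = np.asarray(w, dtype=float)
    half = returns.shape[1] // 2
    first, second = returns[:, :half], returns[:, half:2 * half]
    # Assumption: each half contributes its own covariance (C_ij sigma_i sigma_j).
    predicted = float(w @ np.cov(first, bias=True) @ w)
    realised = float(w @ np.cov(second, bias=True) @ w)
    return predicted, realised
```

In the full procedure the weights would come from optimising against the first-period matrix, so the predicted risk is typically optimistic relative to the realised one.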

The portfolio risk due to the noise in the correlation matrix can then be 
calculated using 



where $\Omega_r^2$ is the risk of the realised portfolio and $\Omega_p^2$ is the risk of the predicted
portfolio.

It was shown in [13,14] that correlations may also be measured in the random 
part of the eigenvalue spectrum. However, since our aim here is to demonstrate 
how Random Matrix Theory can be used to improve the risk/return profile 
for a portfolio of Hedge Funds, we assume that the eigenvalues corresponding 
to the noise band in RMT, $\lambda_- < \lambda < \lambda_+$, are not expected to correspond to
real information following [6,7,8,9,10,11,12,15,16]. We then use this assump- 
tion to remove some of the noise from the correlation matrix. Although the 
technique used in [6,8] has been shown to lead to problems with the stability 
of the correlation matrix [16], we apply it here as a simple test case to demon- 
strate how noise can be removed from the cross-correlation matrix formed 
from hedge fund returns. This technique involves separating the noisy and 
non-noisy eigenvalues and keeping the non-noisy eigenvalues the same. The 
noisy eigenvalues are then replaced by their average and the correlation ma- 
trix is reconstructed from the cleaned eigenvalues. We can then compare the 
risk of both the predicted and realised portfolios for the original and cleaned 
correlation matrices. 
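The cleaning step can be sketched as below. Two choices here are assumptions rather than details from the text: every eigenvalue at or below $\lambda_+$ is treated as noisy (including any below $\lambda_-$), and the diagonal of the cleaned matrix is not reset to one afterwards.

```python
import numpy as np

def clean_correlation_matrix(C, T):
    """Replace noisy eigenvalues of the N x N correlation matrix C by their average."""
    C = np.asarray(C, dtype=float)
    N = C.shape[0]
    Q = T / N
    lam_plus = 1.0 + 1.0 / Q + 2.0 * np.sqrt(1.0 / Q)   # upper RMT bound, sigma^2 = 1
    eigvals, eigvecs = np.linalg.eigh(C)
    noisy = eigvals <= lam_plus
    cleaned = eigvals.copy()
    if noisy.any():
        # Averaging keeps the sum of eigenvalues, and hence the trace, unchanged.
        cleaned[noisy] = eigvals[noisy].mean()
    # Rebuild the matrix from the original eigenvectors and cleaned eigenvalues.
    return (eigvecs * cleaned) @ eigvecs.T
```

Because the non-noisy eigenvalues and all eigenvectors are kept, the deviating modes pass through unchanged while the bulk is flattened.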



3 Results 



3.1 Equally weighted Correlation Matrix 



The dataset studied here is a collection of 49 Hedge Funds with varying 
strategies over a synchronous period from January 1997 to September 2005 
(T = 105). The original dataset was much larger (approximately 1500 funds) 
but since the length of data available was much less than the number of funds 
we were forced to choose a subset. The subset chosen were the 49 funds with
the longest track records, giving a ratio of data length to number of funds of $Q = T/N = 2.143$. Reducing
the dataset in this way is not unrealistic as a typical fund of hedge funds would 
monitor a subset of funds and choose a portfolio from these [2,3]. Often one 
of the criteria used in choosing this subset of investable funds would be the 
completion of a minimum track record. Other data sets, such as a portfolio 
made up of Hedge Fund strategy indices, were also studied by the authors 
with notably similar results. 



First we calculated the empirical Correlation Matrix between the funds using 
equally weighted returns and from this found the spectrum of eigenvalues. 
This is then compared with the theoretical spectrum for random Wishart 
matrices (as per Equation 5), using $\lambda_- = 0.1$ and $\lambda_+ = 2.83$. As can be seen
from Figure 1, the bulk of the eigenvalues conform to those of the random
matrix. There are three deviating eigenvalues, at 10.9886, 8.2898 and 2.944.
This means that 6.1% of eigenvalues deviate from the RMT prediction, which
is consistent with the findings of [8], where the authors argue that at most 6%
of eigenvalues are non-noisy.



Fig. 1. Spectral Distribution for equally weighted Hedge Fund correlation matrix
(panel title: Eigenvalue Distribution, 49 funds, 105 months data; empirical and
theoretical distributions; horizontal axis: eigenvalues)



3.2 Bootstrapping 



In order to show that there was no dependence on the choice of time period 
or the length of the series we broke the time series up into two segments 
and again compared the eigenvalue spectrum of the correlation matrix with 
that of a random Wishart matrix. For both time periods studied we found 
just two eigenvalues that deviated from the RMT prediction. As can be seen 
in Figure 2, the anomalous eigenvalue contributions are very similar for both 
periods chosen, which implies independence from the choice of time period and 
stationarity of the data. The values of the deviating eigenvalues are shown in 
Table 1.
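A sketch of the sub-period comparison, under the assumption that each segment is compared against the RMT upper bound for its own length (splitting the sample shortens $T$ and therefore lowers $Q$). The function name is hypothetical.

```python
import numpy as np

def deviating_by_period(returns, n_periods=2):
    """Eigenvalues above the RMT upper bound for consecutive time segments.

    returns: (N, T) array of fund returns. Each segment uses its own Q = T_segment / N.
    """
    returns = np.asarray(returns, dtype=float)
    N = returns.shape[0]
    results = []
    for block in np.array_split(returns, n_periods, axis=1):
        T = block.shape[1]
        lam_plus = 1.0 + N / T + 2.0 * np.sqrt(N / T)
        eigvals = np.linalg.eigvalsh(np.corrcoef(block))
        # Report deviating eigenvalues in descending order.
        results.append(eigvals[eigvals > lam_plus][::-1])
    return results
```

Comparing the lists returned for each segment gives a simple check on whether the deviating eigenvalues persist across periods.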



Eigenvalue Rank    105 Months Returns
1                  10.9886
2                  8.2898
3                  2.944

11.334
11.6874

Fig. 2. Bootstrapped spectral distribution for consecutive periods.



3.3 Eigenvector Analysis 



Figure 3 shows the distribution of the components of the largest eigenvector 
and also the components of a typical eigenvector from the region predicted by 
RMT. As can be seen from this graph, the distribution of the components of
the largest eigenvector is significantly different from that of an eigenvector
chosen from the random region. For the largest eigenvector, the average value
is much larger and the variance of the components much smaller than for the
eigenvector from the random region, which is in agreement with [11,16] (the
largest eigenvector can be interpreted as the 'market'). In this case the 'market'
is the set of external stimuli that affect most hedge funds (e.g. interest rate
changes, large market moves such as in the S&P 500, margin changes, etc.).



Fig. 3. Comparison of Eigenvector Components, largest Eigenvector (Grey),
Eigenvector from the bulk (Black)

We then remove the effects of the largest eigenvalue using the techniques
described in Section 2.2. This changes the value of $\lambda_{max}$ from 2.8329 to 2.1975,
which means that 4 of the remaining largest eigenvalues are now outside the
RMT region (Figure 4).
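As a check on these figures, the rescaled upper bound can be reproduced from the variance correction of Section 2.2, taking the largest eigenvalue 10.9886 reported in Section 3.1 and $N = 49$:

$$\sigma^2 = 1 - \frac{\lambda_{large}}{N} = 1 - \frac{10.9886}{49} \approx 0.7757, \qquad \sigma^2 \times 2.8329 \approx 2.198,$$

which agrees with the quoted value of 2.1975 up to rounding.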

The distribution of the components of the largest remaining deviating eigenvector
shows some distinctive clustering (Figure 5). In particular the Managed
Futures, Emerging Markets and European Long/Short Equity strategies are the
major contributors here.



A similarly-clustered distribution also emerges for the other deviating eigenvalues.
An analysis of the eigenvector components for the 2nd largest remaining
eigenvector, after removing the effects of the market eigenvector (Figure 6),
shows distinctive clusters emerging, especially for the Managed Futures and
Long/Short Equity sectors. However, the components corresponding to the
Managed Futures strategy do not deviate much from zero and hence make little
contribution. These findings for clustering of the deviating eigenvalues agree
with [17], where the authors show that, in addition to the largest, the other
deviating eigenvalues of the correlation matrix of asset returns also contain
information about the risk associated with the assets.



Fig. 4. Eigenvalue spectrum after the removal of the effects of the largest Eigenvalue
(panel title: Eigenvalue Distribution, 49 Funds, 105 months data; empirical and
theoretical distributions; horizontal axis: eigenvalues)

Fig. 5. Distribution of eigenvector components, largest remaining eigenvalue



3.4 Strategy Identification 



From the analysis described in Section 3.3, we look at the contribution,
$X_l^k = \sum_{i=1}^{N} P_{li} [u_i^k]^2$, to each of the deviating eigenvalues from the different
strategies employed. Figure 7 shows $X_l^k$ for the largest remaining eigenvector once
the effects of the market eigenvalue are removed. The largest contributors
are clearly Managed Futures and Emerging Markets. However, the strategy
contribution for Managed Futures is only around twice that for many of the
other sectors. Hence Managed Futures and Emerging Markets are the dominant
strategies but care in interpretation is needed, since neither contributor
is dominant overall.

Fig. 6. Distribution of eigenvector components, 2nd largest remaining eigenvalue

Figure 8 shows the strategy contributions for the second largest eigenvalue.
These are interesting, since three of the four dominant strategies (Asia, Global
Equity and European Long/Short Equity) are equity strategies and would all
be affected by events in world equity markets. The fourth strategy, Self-Invested
Fund of Funds, may well also consist of equity funds. There is
limited information available on exactly which types of funds the managers were
invested in, although there is reason to believe that a majority of them would be
equity based. This suggests that this eigenvalue contains information
just on equity funds.

Figure 9 contains the strategy contributions for the third largest remaining 
eigenvalue. Clearly the dominant strategy here is Currency. The final deviat- 
ing eigenvalue also has one dominant strategy which is Self Invested Fund of 
Funds. It is notable in the above that analysis of the eigenvalues from within 
the random matrix region revealed no dominant strategies. The evidence of 
strategy information in the deviating eigenvalues, coupled with a lack of dom- 
inant strategies within the RMT region, supports the idea that information 
in the correlation matrix is chiefly contained within the deviating eigenvalues.
We show how this information can be used to create a portfolio with an
improved risk-return profile in Section 3.6.

Fig. 8. Strategy Contribution, 2nd largest eigenvalue

Fig. 9. Strategy Contribution, 3rd largest eigenvalue

3.5 Inverse Participation Ratio



Figure 10 shows the inverse participation ratio (IPR) calculated for the eigen- 
vectors of the Hedge Fund cross-correlation matrix studied. The average IPR 
value is around 0.06, larger than would be reasonably expected ($1/N \approx 0.02$) if
all components contributed to each eigenvector. We would also expect that the 
largest eigenvectors contributed much more markedly. However they appear 
to have a similar IPR to eigenvectors corresponding to the random section. 
Part of the reason may be that the sample size is small, so the IPR is not 
particularly effective in terms of assessing by how much larger eigenvectors 
deviate from the random region, since the average value of the IPR relies on 
a sample size that tends to infinity. 

A significant deviation from the average IPR value is found for the first few 
eigenvalues. This seems to have been caused by the inclusion of four or five 
funds, identical apart from being either in different base currencies or leveraged
versions of each other, with correspondingly high correlation ($\approx 1$). As
mentioned in [11], funds with a correlation coefficient much greater than the
average effectively decouple from the other funds.

Fig. 10. Inverse Participation Ratio

3.6 Noise Removal and Portfolio Optimisation



It was noted earlier that where the time series available to estimate cross- 
correlation matrices are of finite length this leads to measurement noise. This 
problem is particularly prevalent with hedge fund data, since only monthly 
returns are available. As seen in Figure 11 the realised risk is, on average, 292% 
of the predicted risk. This has obvious consequences for risk management. 
However, by cleaning the correlation matrix as described in Section 2.3, the 
realised risk is 190% of the predicted risk. This huge improvement is brought 
about by limiting the correlation matrix to the information band prescribed 
by RMT. 

It can be seen in Figure 11 that, for some return values, the predicted risk 
(using the filtered correlation matrix) is actually less than that of the original 
correlation matrix. This is due to the constraints imposed on the portfolio, in 
particular the restriction of no 'short-selling' (Section 2.3). 

The use of the cleaned correlation matrix leads to a 35% improvement in 
the difference between the realised risk and the predicted risk for the optimal 
portfolio. So we have shown that, in the case of Hedge Fund returns data, the 
cleaned correlation matrix is a good choice for portfolio optimisation. 
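One reading of the 35% figure, assuming it refers to the relative reduction in the ratio of realised to predicted risk reported above (292% before cleaning, 190% after):

$$\frac{2.92 - 1.90}{2.92} \approx 0.349 \approx 35\%.$$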
Fig. 11. Efficient Frontiers using original and cleaned correlation matrices

4 Conclusions



We have illustrated that, even with limited data (105 months of returns data 
for 49 Hedge Funds), useful information can be extracted from a cross cor- 
relation matrix constructed from hedge fund returns. Significant deviations 
from Random Matrix Theory predictions are observed, with further analysis
showing that there is real strategy information contained within the deviating
eigenvalues. Eigenvector analysis revealed distinct strategy clustering in
the deviating eigenvectors. These included Emerging Markets and Managed
Futures in the largest eigenvector, Equity funds in the second, and Currency and
Fund of Funds in the final two deviating eigenvectors. The strategy information
in the deviating eigenvalues was then used to clean the correlation matrix,
by flattening the eigenvalues from the bulk to their average and holding the
deviating eigenvalues the same. A 35% improvement between the risk of the
predicted and realised portfolios was found using this filtering technique.



A Hedge Fund Strategies 

Strategies employed by the managers in the sample considered: 

Strategies                                  Number of Funds
Asia excluding Japan Long/Short Equities    2
Convertible & Equity Arbitrage              2
Currency                                    7
Emerging Markets                            6
European Long/Short Equity                  10
Fixed Income                                1
Global Equity                               5
Japan Market Neutral                        1
Macro                                       3
Managed Futures                             11
Self-Invested Fund of Funds                 1

Table A.1
Hedge Fund Strategies